Abstract

We examine questions involving nondeterministic finite automata where all states
are final, initial, or both initial and final. First, we prove hardness results for the
nonuniversality and inequivalence problems for these NFAs. Next, we characterize the
languages accepted. Finally, we discuss some state complexity problems involving such
automata.

1 Introduction

Nondeterministic finite automata (NFAs) differ from deterministic finite automata (DFAs)
in at least two important ways. First, they can be exponentially more concise in expressing
certain languages, as it is known that there exist NFAs on $n$ states for which the smallest
equivalent DFA has $2^n$ states [5, 12, 10]. Second, while it is possible to test inequivalence
and nonuniversality for DFAs in polynomial time, the corresponding problems for NFAs are
PSPACE-complete [11, Lemma 2.3, p. 127].

In this paper, we consider NFAs with certain natural restrictions, such as having all states 
final, all states initial, or all states both initial and final. Although imposing these conditions 
significantly narrows the class of languages accepted (see § 6), we show that there is still 
an exponential blow-up in converting to an equivalent DFA, and the corresponding decision 
problems are still PSPACE-complete. Furthermore, these restricted NFAs are intimately 
related to languages that are prefix-closed, suffix-closed, or factor-closed (see § 6 and [3]),
and have close connections with certain decision questions on infinite words and a decision 
problem on Boolean matrices [13]. 

Here is a brief outline of the paper. In Section 2, we give some basic definitions and 
notation. In Sections 3, 4, and 5, we prove an assortment of hardness results on NFAs 
with various restrictions on their initial and final states. In Section 6, we give a simple 
characterization of the languages accepted by NFAs with these restrictions. In Sections 7 
and 8, we give our main results, on state complexity. We end with Sections 9 and 10, where 
we discuss the complexity of complement and the length of the shortest word not accepted. 

2 Definitions and notation 

We recall some basic definitions. For further details, see [8]. A nondeterministic finite
automaton (NFA) $M$ is a quintuple $M = (Q, \Sigma, \delta, I, F)$, where $Q$ is a finite set of states; $\Sigma$
is a finite alphabet; $\delta : Q \times \Sigma \to 2^Q$ is the transition function, which we extend to $Q \times \Sigma^*$
in the natural way; $I \subseteq Q$ is the set of initial states¹; and $F \subseteq Q$ is the set of final states.
An NFA $M$ accepts a word $w \in \Sigma^*$ if $\delta(I, w) \cap F \ne \emptyset$. The language of all words accepted
by $M$ is denoted $L(M)$.
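The acceptance condition can be made concrete with a short sketch. The representation below (a dictionary from (state, symbol) pairs to sets of states, and the names `step` and `accepts`) is an assumption chosen for illustration and is not taken from the paper.

```python
from typing import Dict, FrozenSet, Iterable, Set, Tuple

State = int
Delta = Dict[Tuple[State, str], Set[State]]


def step(delta: Delta, current: Iterable[State], symbol: str) -> FrozenSet[State]:
    """Apply the transition function to a whole set of states at once."""
    return frozenset(q for p in current for q in delta.get((p, symbol), ()))


def accepts(delta: Delta, initial: Set[State], final: Set[State], word: str) -> bool:
    """Return True iff delta(I, w) has a state in common with F."""
    current = frozenset(initial)
    for symbol in word:
        current = step(delta, current, symbol)
    return bool(current & final)


# Hypothetical two-state example in which every state is final.
example_delta = {(0, "a"): {0, 1}, (1, "b"): {1}}
example_initial = {0}
example_final = {0, 1}
```

When every state is final, a word is rejected exactly when the set of current states becomes empty, which is why prefixes of accepted words are always accepted.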

A deterministic finite automaton (DFA) $M$ is defined as an NFA above, with the following
restrictions: $M$ has only one initial state $q_0$, and $|\delta(q, a)| = 1$ for all $q \in Q$ and $a \in \Sigma$.

The state complexity of a regular language L is the number of states in the minimal DFA 
accepting L. Given an operation on regular languages, we also define the state complexity 
of that operation to be the number of states that are both sufficient and necessary in the 
worst-case for a DFA to accept the resulting language. 

3 Hardness results 

First, we discuss the case of NFAs with a single initial state, and where all states are final. 
Consider the following decision problem:

NFA-INEQUIVALENCE-ASF(k): Given two NFAs $M_1$ and $M_2$, over an alphabet with $k$ letters,
each having the property that all states are final states, is $L(M_1) \ne L(M_2)$?

We will prove the following theorem. 

Theorem 1. NFA-INEQUIVALENCE-ASF(k) is PSPACE-complete for $k \ge 2$, but solvable in
polynomial time for $k = 1$.

¹In the "formal" definition of an NFA (e.g., [8, p. 20]), only one initial state is typically allowed. However,
NFAs with multiple initial states can clearly be simulated by NFA-ε's, and hence by NFAs with at most one
more state. We find it useful to avoid the complication of ε-transitions and simply allow having multiple
initial states here.
Proof. First, let us consider the case where $k = 1$. Let $M$ be a unary NFA (over the alphabet
$\Sigma = \{a\}$) with all states final. Then $L(M)$ is either finite or $\Sigma^*$, depending on whether there
is a cycle in the directed graph $G$ given by the transitions of $M$. Furthermore, if $M$ has $n$
states, then $a^n \in L(M)$ iff $G$ has a cycle reachable from $q_0$. Therefore, we can determine
$L(M)$ efficiently by checking first if $a^n$ is accepted; if it is, then $L(M) = \Sigma^*$. If it isn't, we then
successively check whether $a^{n-1}, a^{n-2}, \ldots, a^1, \epsilon$ are accepted. If the first string in this list that is
accepted is $a^i$, then $L(M) = \{\epsilon, a, \ldots, a^i\}$. Thus we can check whether $L(M_1) \ne L(M_2)$ efficiently.
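The unary procedure can be sketched as follows. This is an illustrative sketch under stated assumptions: the successor-map representation and the function names are hypothetical, and the loop scans lengths upward from 1 rather than downward from $n$ as in the text, which determines the same language.

```python
def unary_asf_language(n, succ, q0=0):
    """Describe L(M) for a unary NFA with n states, all of them final.

    succ maps a state to the set of its successors on the single letter a.
    Returns None when L(M) = a*, and otherwise the largest t such that
    a^t is accepted, so that L(M) = {epsilon, a, ..., a^t}.
    """
    current = {q0}
    longest = 0
    for length in range(1, n + 1):
        current = {q for p in current for q in succ.get(p, ())}
        if not current:
            return longest          # a^length is the shortest rejected word
        longest = length
    return None                      # a^n is accepted: a cycle is reachable


def unary_asf_inequivalent(n1, succ1, n2, succ2):
    """Decide whether two unary all-states-final NFAs differ."""
    return unary_asf_language(n1, succ1) != unary_asf_language(n2, succ2)
```

Each call performs at most $n$ set updates, so the comparison runs in time polynomial in the sizes of the two automata.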

The NFA-INEQUIVALENCE-ASF problem is in PSPACE, since the more general NFA-INEQUIVALENCE
problem (problem AL1 in Garey and Johnson [6, p. 265]) is well-known to be in PSPACE.
A proof of this result can be found in Sipser [15, p. 315].

Now we need to see that NFA-INEQUIVALENCE-ASF is PSPACE-hard. To do so, we consider
the specialization NFA-NONUNIVERSALITY-ASF:

NFA-NONUNIVERSALITY-ASF(k): Given an NFA $M$ over an alphabet $\Sigma$ with $k$ letters, having
the property that all states are final states, is $L(M) \ne \Sigma^*$?

Clearly if we prove the stronger result that NFA-NONUNIVERSALITY-ASF(k) is PSPACE-hard,
then it will follow that NFA-INEQUIVALENCE-ASF(k) is PSPACE-hard, by choosing one
of the NFAs to be the one-state NFA with a loop back to the single state on every input
symbol. So it will suffice to prove the following lemma:

Lemma 2. NFA-NONUNIVERSALITY-ASF(k) is PSPACE-hard for $k \ge 2$.

Proof. First, let us consider the case where $k \ge 3$. We reduce from the following decision
problem, which is well-known to be PSPACE-complete [6, p. 265] for $k \ge 2$:

NFA-NONUNIVERSALITY(k): Given an NFA $M$ over an alphabet $\Sigma$ with $k$ letters, is $L(M) \ne \Sigma^*$?

Here are the details of the reduction. 

Given an NFA $M$ over an alphabet of size $k$, we transform it to an NFA $M'$ with all
states final, over an alphabet of size $k + 1$, as follows: $M'$ is identical to $M$, except that we
add a transition from each final state of $M$ to the initial state $q_0$ on a new symbol, say #,
and then we change all states to be final states. This construction is illustrated below in
Figure 1.

Let $\Delta = \Sigma \cup \{\#\}$.
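The transformation of $M$ to $M'$ is mechanical, as the following sketch suggests. The dictionary representation and the names `add_marker_transitions` and `accepts` are assumptions made for illustration; the example automaton is hypothetical.

```python
def add_marker_transitions(states, delta, q0, final, marker="#"):
    """Build M' from M: add p --marker--> q0 for every final state p of M,
    then declare every state final. delta maps (state, symbol) to a set."""
    new_delta = {key: set(targets) for key, targets in delta.items()}
    for p in final:
        new_delta.setdefault((p, marker), set()).add(q0)
    return new_delta, set(states)


def accepts(delta, q0, final, word):
    """Standard NFA acceptance from the single initial state q0."""
    current = {q0}
    for symbol in word:
        current = {q for p in current for q in delta.get((p, symbol), ())}
    return bool(current & final)


# Hypothetical M over {a}: accepts exactly the words of even length.
states = {0, 1}
delta = {(0, "a"): {1}, (1, "a"): {0}}
new_delta, new_final = add_marker_transitions(states, delta, 0, {0})
```

In this example $M$ rejects the word a, and correspondingly $M'$ rejects a#, since the only state reached after a was not final in $M$ and so has no transition on #.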

We now claim that $L(M) \ne \Sigma^*$ iff $L(M') \ne \Delta^*$, or, equivalently, $L(M) = \Sigma^*$ iff $L(M') = \Delta^*$.

Suppose $L(M) = \Sigma^*$. Then for each string $w \in \Sigma^*$, there exists a final state $p(w)$ of
$M$ such that $p(w) \in \delta(q_0, w)$. Let $x \in \Delta^*$. If $x \in \Sigma^*$, the result is clear. Otherwise write
$x = x_1 \# x_2 \# x_3 \# \cdots \# x_n$, where each $x_i \in \Sigma^*$. Now there exists an accepting computation
for $x$ in $M'$, which starts in $q_0$, follows $x_1$ to the state $p(x_1)$, then follows the transition on
# back to $q_0$ of $M'$, then follows $x_2$ to $p(x_2)$, etc. Thus $L(M') = \Delta^*$.

Now suppose $L(M') = \Delta^*$. Then, in particular, $M'$ accepts all strings of the form $w\#$
where $w \in \Sigma^*$. In order for $M'$ to accept $w\#$ it must be the case that there is a transition
from a state $p \in \delta(q_0, w)$ on # in $M'$. But then this state is final in $M$, by construction, so
$w$ is accepted by $M$. Thus $L(M) = \Sigma^*$.

Figure 1: The transformation of M to M'.

This completes the reduction. Note that our construction increases the size of the alphabet
by 1, so that we have shown that

NFA-NONUNIVERSALITY(k) reduces to NFA-NONUNIVERSALITY-ASF(k + 1).

Since NFA-NONUNIVERSALITY is PSPACE-hard for $k \ge 2$, we have proved the lemma for
$k \ge 3$.

It remains to show NFA-NONUNIVERSALITY-ASF(2) is PSPACE-hard. To do this, we show
by recoding that NFA-NONUNIVERSALITY-ASF(4) reduces to NFA-NONUNIVERSALITY-ASF(2).

Here are the details. Given a machine $M$ over the input alphabet $\Sigma = \{0, 1, 2, 3\}$ with
all states final, we create a new machine $M'$ over the input alphabet $\Delta = \{0, 1\}$. Each
transition out of a state A is recoded, and two new final states are introduced, so that

• a transition on 0 is replaced by a transition on 0 followed by 0

• a transition on 1 is replaced by a transition on 0 followed by 1

• a transition on 2 is replaced by a transition on 1 followed by 0

• a transition on 3 is replaced by a transition on 1 followed by 1

See Figure 2.
Figure 2: The transformation of M to M'.

We claim that $L(M) = \Sigma^*$ iff $L(M') = \Delta^*$. □

This completes the proof of Theorem 1. □ 

Corollary 3. Minimizing an NFA with all states final, over an alphabet of size $\ge 2$, is
PSPACE-hard.

Proof. If we could minimize an NFA with all states final, we could also solve the nonuniversality
problem $L(M) \ne \Sigma^*$ as follows: first we minimize the NFA. If it has $\ge 2$ states, we
say "yes". Otherwise we inspect the transitions (if any) of the minimized NFA, and check if
the single state is final and that there is a loop on every element of the alphabet. If so, we
say "no"; otherwise, we say "yes". □



4 Generalized NFA with all states initial 

Now we consider a variant of the problems considered in Section 3. These variants concern 
generalized NFAs with multiple initial states allowed, in which all states are initial states 
and there is only one final state. We consider the following decision problems: 

NFA-INEQUIVALENCE-ASI(k): Given two NFAs $M_1$ and $M_2$, over an alphabet with $k$ letters,
each having the property that all states are initial states and only one state is final, is
$L(M_1) \ne L(M_2)$?

NFA-NONUNIVERSALITY-ASI(k): Given an NFA $M$ over an alphabet $\Sigma$ with $k$ letters, having
the property that all states are initial states and only one state is final, is $L(M) \ne \Sigma^*$?

We prove the following theorem. 

Theorem 4. Both NFA-INEQUIVALENCE-ASI(k) and NFA-NONUNIVERSALITY-ASI(k) are PSPACE-complete
for alphabet size $k \ge 2$, but solvable in polynomial time for $k = 1$. Furthermore,
minimizing an NFA with all states initial and one state final is PSPACE-hard for $k \ge 2$.

Proof. These results follow trivially from the results in the previous section by observing
that $L$ is accepted by an NFA $M$ with a single initial state and all states final iff $L^R$ (the
language formed by reversing all the strings of $L$) is accepted by $M^R$, the generalized NFA
formed by reversing all the transitions of $M$, and changing initial states into final and vice
versa. □
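The reversal used in this argument can be sketched in a few lines. The representation and the names `reverse_nfa` and `accepts` are assumptions for illustration only; the two-state automaton is hypothetical.

```python
def reverse_nfa(delta, initial, final):
    """Return (delta_R, initial_R, final_R): every transition p --a--> q
    becomes q --a--> p, and the initial and final sets are exchanged."""
    reversed_delta = {}
    for (p, a), targets in delta.items():
        for q in targets:
            reversed_delta.setdefault((q, a), set()).add(p)
    return reversed_delta, set(final), set(initial)


def accepts(delta, initial, final, word):
    """Acceptance for a generalized NFA with a set of initial states."""
    current = set(initial)
    for symbol in word:
        current = {q for p in current for q in delta.get((p, symbol), ())}
    return bool(current & final)


# Hypothetical NFA: single initial state 0, all states final.
delta = {(0, "a"): {0, 1}, (1, "b"): {1}}
rev_delta, rev_initial, rev_final = reverse_nfa(delta, {0}, {0, 1})
```

Here the reversed automaton has all states initial and the single final state 0, and it accepts a word exactly when the original accepts its reversal.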



5 All states both initial and final 

Our original motivation in Section 1 involved generalized NFAs where all states are both
initial and final. Consider the following decision problem:

NFA-INEQUIVALENCE-ASIF(k): Given two NFAs $M_1$ and $M_2$, over an alphabet with $k$ letters,
each having the property that all states are both initial and final, is $L(M_1) \ne L(M_2)$?

Theorem 5. NFA-INEQUIVALENCE-ASIF(k) is PSPACE-complete for $k \ge 2$, but solvable in
polynomial time for $k = 1$.

Proof. The idea is similar to that in the proof of Theorem 1. We only indicate what needs 
to be changed. 

Once again, we work with the "easier" problem 

NFA-NONUNIVERSALITY-ASIF(k): Given an NFA $M$ over an alphabet $\Sigma$ with $k$ letters, having
the property that all states are both initial and final, is $L(M) \ne \Sigma^*$?

We can show that NFA-NONUNIVERSALITY(k) reduces to NFA-NONUNIVERSALITY-ASIF(k + 1)
using a simple variant of our previous proof. Given $M$, an NFA over an alphabet $\Sigma$ of
$k$ symbols, we modify it to obtain $M'$, an NFA over an alphabet $\Delta = \Sigma \cup \{\#\}$ of $k + 1$
symbols, as follows. First, we delete all states of $M$ not reachable from $q_0$, the start state.
Next, we introduce a new symbol # and transitions on # from each of the final states of $M$
to $q_0$. Finally, we change all states to be both initial and final. We claim that $L(M) = \Sigma^*$
iff $L(M') = \Delta^*$.

The direction $L(M) = \Sigma^* \Rightarrow L(M') = \Delta^*$ is exactly as before. For the other direction,
suppose $L(M') = \Delta^*$. Then, in particular, for all $x \in \Sigma^*$, the string $\#x\#$ is accepted by
$M'$. Consider an accepting path for this string in $M'$. It starts at some state (since all states
are initial) and then follows a transition on # to $q_0$. The machine $M'$ now processes $x$ and
arrives at some state $q$. In order for $M'$ to reach a final state on the last symbol, #, there
must be a transition on # from $q$ to $q_0$. But this can only be the case if $q$ was final in $M$.
Thus we have found an accepting path for $x$ in $M$, and so $L(M) = \Sigma^*$.

Thus we have shown NFA-NONUNIVERSALITY-ASIF(k) is PSPACE-complete for $k \ge 3$,
and thus, that NFA-INEQUIVALENCE-ASIF(k) is PSPACE-complete for $k \ge 3$.

To complete the proof of the theorem, we prove the following lemma. 

Lemma 6. NFA-NONUNIVERSALITY-ASIF(2) is PSPACE-complete.

Proof. It is enough to show that NFA-NONUNIVERSALITY-ASF(3) reduces to
NFA-NONUNIVERSALITY-ASIF(2). The reduction has several steps, but the basic idea is simply
to recode the 3-letter alphabet {0, 1, 2} into strings over a 2-letter alphabet {1, 10, 100}.

Given an NFA $M$ with input alphabet $\Sigma = \{0, 1, 2\}$, a single initial state $q_0$, and all
states final, we first modify $M$ to enforce the condition that there be no transitions entering
the initial state. To do this, we double the initial state, adding a new state $p_0$ with the same
outgoing transitions as $q_0$, and make any transitions formerly entering $q_0$ enter $p_0$ instead.

Second, we enforce the condition that the labels of all transitions entering a particular
state be the same. To do this, we triple each state except the initial state (which, by construction,
now has no incoming transitions), copying the outgoing transitions, and assigning
an incoming transition of each element of $\Sigma$ to one of the three states, appropriately.

Third, we recode the transitions of the NFA, as follows:

0 gets recoded as 1
1 gets recoded as 10
2 gets recoded as 100

Of course, this recoding necessitates introducing intermediate states for transitions on 1 and
2. We call these intermediate states "new" and all other states "old".

The incoming transitions of each old state have the same labels, which are either 1, 10,
or 100. In our fourth step, we add additional outgoing transitions, and states, as depicted in
Figure 3. The dotted transitions indicate transitions that include some nondepicted states,
and the dashed circles indicate the additional states added. The effect of these additional
transitions is to allow, from each old state with an incoming arrow, a path labeled by 1 and
then 3 or more zeroes that returns to $q_0$.

Figure 3: Additional outgoing transitions



Finally, we make all states both initial and final. Call the resulting generalized NFA $M'$.
We claim $M$ accepts $\Sigma^*$ iff $M'$ accepts $\Delta^*$, where $\Delta = \{0, 1\}$. Define the morphism $h$ by
$0 \to 1$, $1 \to 10$, and $2 \to 100$.

Suppose $M$ accepts $\Sigma^*$. We need to show that every $s \in \Delta^*$ is accepted by $M'$. Let
us identify the maximal blocks of 3 or more zeroes in $s$, if they exist. These blocks either
mark the beginning or end of $s$, or else are bounded on the left by a string specified by
$(\epsilon + 0 + 00)(1 + 10 + 100)^* 1$, and on the right by a string specified by $(1 + 10 + 100)^+$. Thus
every string in $\Delta^*$ has one of the following forms:

(a) y 

(b) yw 

(c) $(zx)^* z$

(d) $yx(zx)^* z$

(e) $(zx)^* zw$

(f) $yx(zx)^* zw$

where $y = \{\epsilon, 0, 00\}$, $x = \{1, 10, 100\}^* 1$, $z = \{000\}\{0\}^*$, and $w = \{1, 10, 100\}^+$. For forms
(a)-(f), we argue that each string $s$ specified is accepted by $M'$. We do this only for part (f),
as the others are similar.

Let $s \in \Delta^*$. We show how to construct an accepting path for $s$ in $M'$, where $s$ is of the
form $yx(zx)^* zw$. Write $s = y' x_0 z_0 x_1 \cdots z_{n-1} x_n z_n w'$, where $y'$ is a string of $y$, each $x_i$ is a
string of $x$, each $z_i$ is a string of $z$, and $w'$ is a string of $w$.
First, consider an accepting path for $2h^{-1}(x_0)$ in $M$. This path corresponds to a path in
$M'$ starting at $q_0$ and visiting a sequence of old states in turn. In particular, the path for
the prefix 2 corresponds in $M'$ to a sequence of transitions on (successively) 1, 0, 0, leading
to an old state. Call the sequence of states encountered $q_1, q_2, q_3$. Since every state of $M'$ is
initial, we can choose to start our accepting path at

• $q_1$ (if the string $s$ we are trying to accept starts with 001);

• $q_2$ (if the string $s$ we are trying to accept starts with 01);

• $q_3$ (if the string $s$ we are trying to accept starts with 1).

Thus there is a path in $M'$ starting at either $q_1$, $q_2$, or $q_3$, processing $y' x_0$, and ending in
an old state. At this point we can read $z_0$, which leads back to $q_0$. It now remains to
construct a path for $x_1 z_1 \cdots x_n z_n w'$. Again, there is a path from $q_0$ in $M$ on $h^{-1}(x_1)$, and
this corresponds to a path in $M'$ leading to an old state. We can now process the symbols of
$z_1$, leading back to $q_0$. This process continues until after reading $z_n$ we have returned once
more to $q_0$. At this point we can process the symbols of $w'$, and we are in an accepting state.
Thus $M'$ accepts $s$.

For the other direction, assume $M'$ accepts $\Delta^*$. We must show $M$ accepts $\Sigma^*$. Clearly $M$
accepts $\epsilon$, since $M$ has an initial state and all states are final. Now let $s \in \Sigma^+$, and consider
the string $1000\,h(s)\,1$ in $\Delta^*$. This string is accepted, and so there is an accepting path starting
in some state (not necessarily $q_0$) for it in $M'$. By our construction, after reading 1000, we
are either in $q_0$ or in some new state. If we are in a new state, however, there is no possible
transition on 1, so we must be in $q_0$ after reading 000. Now an acceptance path for $h(s)1$
from $q_0$ corresponds to an acceptance path for $s0$, and hence $s$, in $M$. (We require the final
1 because otherwise if $s$ ends in 0, we could be in a new state of $M'$ which would not map
back to a path in $M$.) □

This completes the proof of Theorem 5. □ 

Corollary 7. Minimizing an NFA with all states both initial and final is PSPACE-hard. 



6 Characterization of the languages accepted by special NFAs

In this section we observe that the languages accepted by the kinds of NFAs we have been 
discussing have a simple characterization. 

We define pref(L) to be the language of all prefixes of strings of L, suff(L) to be the
language of all suffixes of strings of L, and fact(L) to be the language of all factors (aka
"subwords") of strings of L. A language L is prefix-closed if L = pref(L), suffix-closed if
L = suff(L), and factorial if L = fact(L).
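For finite languages these three operators can be computed directly, which gives a quick way to experiment with the definitions. The sketch below represents a language as a Python set of strings; this representation and the helper name `is_closed` are assumptions for illustration.

```python
def pref(language):
    """All prefixes of words in a finite language."""
    return {w[:i] for w in language for i in range(len(w) + 1)}


def suff(language):
    """All suffixes of words in a finite language."""
    return {w[i:] for w in language for i in range(len(w) + 1)}


def fact(language):
    """All factors (contiguous subwords) of words in a finite language."""
    return {w[i:j] for w in language
            for i in range(len(w) + 1)
            for j in range(i, len(w) + 1)}


def is_closed(language, operator):
    """A language is closed under an operator when applying it adds nothing."""
    return operator(language) == set(language)
```

For example, `pref({"abc"})` is prefix-closed but not suffix-closed, while `fact({"abc"})` is closed under all three operators.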

The results summarized in the following theorem are easy to prove. Part (b) was noted 
by Gill and Kou [7]. 
Theorem 8. (a) A nonempty regular language is prefix-closed if and only if it is accepted
by some NFA with all states final;

(b) A nonempty regular language is suffix-closed if and only if it is accepted by some generalized
NFA with all states initial and one final state.

(c) A nonempty regular language is factorial if and only if it is accepted by some generalized 
NFA with all states both initial and final. 

It is natural to consider the complexity of testing whether a given regular language 
is prefix-closed, suffix-closed, or factorial. We will see below that the answer depends on 
whether the input is given as an NFA or a DFA. 

Theorem 9. The following problems are PSPACE-complete: given an NFA $M$, decide if
$L(M)$ is not prefix-closed (resp. suffix-closed, factorial).

Proof. To show that determining if $L(M)$ is not prefix-closed is in PSPACE, we first give a
non-deterministic algorithm. The desired result will then follow by Savitch's Theorem. Let
$n$ be the number of states of $M$. If $L(M)$ is not prefix-closed, there exists a string $w \in L(M)$
such that some prefix $w'$ of $w$ is not in $L(M)$. We guess such a $w$ of length $< 2^{n+1}$ one
input symbol at a time and verify that $w$ is accepted by $M$ but some prefix $w'$ is not. The
space required is that for the current set of states of $M$ and for an $n + 1$ bit counter, which
is clearly polynomial. It remains to show that if such a $w$ exists, we may choose $w$ to have
length $< 2^{n+1}$. Suppose the shortest such $w$ has length $\ge 2^{n+1}$. Let $w'$ be the prefix of $w$
not accepted by $M$. During the computation of $M$ on the first $2^n$ symbols of $w$, $M$ must
repeat a set of states, and similarly for its computation on the second $2^n$ symbols of $w$. If
$w'$ has length $\ge 2^n$, then omitting the portion of the computation between the repeated set
of states in the first half of $w$ yields a new, shorter string accepted by $M$ with a prefix not
accepted by $M$, contradicting the minimality of $w$. If $w'$ has length $< 2^n$, then omitting the
portion of the computation between the repeated set of states in the second half of $w$ gives
the same result. We conclude that a shortest such $w$ has length $< 2^{n+1}$.

A similar argument shows that determining if $L(M)$ is not suffix-closed is also in PSPACE.
Noting that fact(L) = suff(pref(L)), one concludes that determining if $L(M)$ is not factorial is
also in PSPACE.

To show PSPACE-hardness we use the reduction from the acceptance problem for
polynomial-space bounded Turing machines to NFA-NONUNIVERSALITY given by Aho, Hopcroft, and
Ullman [1, Section 10.6]. Given a deterministic Turing machine $T$ and an input $w$, Aho,
Hopcroft, and Ullman [1, Section 10.6] showed how to construct a regular expression $E$
specifying all strings that do not represent an accepting computation of $T$ on $w$. From $E$
we can construct an NFA $M$ for $L(E)$ in polynomial space using the standard constructions.
Thus if $T$ does not accept $w$, the NFA $M$ accepts all strings over its input alphabet $\Sigma$. If
$T$ does accept $w$, then $M$ accepts all strings except the one string $x$ that represents the
accepting computation of $T$ on $w$. But now if $L(M) = \Sigma^*$, then $L(M)$ is clearly prefix-closed,
suffix-closed, and factorial. If $L(M) = \Sigma^* \setminus \{x\}$, then $L(M)$ is not prefix-closed, suffix-closed,
or factorial. Thus $L(M) = \Sigma^*$ iff $L(M)$ is prefix-closed (resp. suffix-closed, factorial). Since
the problem of deciding if $L(M) \ne \Sigma^*$ is PSPACE-complete, we conclude that deciding if
$L(M)$ is not prefix-closed (resp. suffix-closed, factorial) is PSPACE-complete. □

Theorem 10. The following problems can be solved in polynomial time: given a DFA $M$,
decide if $L(M)$ is not prefix-closed (resp. suffix-closed, factorial).

Proof. Given a DFA $M$ we may easily construct a DFA $M'$ accepting pref(L(M)) by making
final every state in $M$ that can reach a final state. To test if $L(M)$ is not prefix-closed
is to test the non-emptiness of $L(M') \setminus L(M)$, which is easily done in polynomial time by
the cross-product construction and the standard algorithm for testing the emptiness of a
language accepted by a DFA.
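A minimal sketch of this test follows. Because $M'$ and $M$ share the same transitions, the cross product only ever visits pairs of identical states, so the sketch uses the equivalent shortcut of looking for a reachable non-final state that can still reach a final state. That shortcut, the dictionary representation of a complete DFA, and the function name are assumptions of this sketch rather than the construction as stated in the proof.

```python
def is_prefix_closed(states, alphabet, delta, q0, final):
    """Polynomial-time test for a complete DFA; delta maps (state, symbol)
    to a single state."""
    # States reachable from the start state.
    reachable = {q0}
    stack = [q0]
    while stack:
        p = stack.pop()
        for a in alphabet:
            q = delta[p, a]
            if q not in reachable:
                reachable.add(q)
                stack.append(q)
    # States from which some final state can be reached (the final states of M').
    live = set(final)
    changed = True
    while changed:
        changed = False
        for p in states:
            if p not in live and any(delta[p, a] in live for a in alphabet):
                live.add(p)
                changed = True
    # L(M') \ L(M) is nonempty iff some reachable live state is non-final in M.
    return all(p in final for p in reachable & live)
```

The same reachability and liveness computations can be reused for the suffix-closed and factorial tests described next, once the start state is varied.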

To determine if $L(M)$ is not suffix-closed, first let $M = (Q, \Sigma, \delta, 0, F)$, where $Q =
\{0, \ldots, n-1\}$. We construct at most $n$ new DFAs $M_i$, $0 \le i \le n-1$, where $i$ is a state
of $M$ reachable from 0 and $M_i$ is identical to $M$ except that $i$ is the start state of $M_i$. We
now test if any of the $M_i$ accept a string not accepted by $M$. As before, this can be done in
polynomial time for each $M_i$, and since we have at most $n$ machines $M_i$, the overall runtime
is polynomial.

To determine if $L(M)$ is not factorial, we construct the $M_i$ as above, but now for each
$M_i$ we make final every state of $M_i$ that can reach a final state. Again, we now test if any
of the $M_i$ accept a string not accepted by $M$. □

For more exact analysis of the running time, see [3]. 

7 State complexity results 

We now turn to state complexity results. It is well known that, for all $n \ge 1$, there exists an
NFA with $n$ states such that the minimal equivalent DFA has $2^n$ states. In this section we show
that the maximum blow-up can still be achieved for alphabets of size $\ge 2$, if we demand
that all states be final, initial, or both initial and final. We note that in computing the
state complexity, we demand that our DFAs be complete, that is, that there is a well-defined
transition from every state and every input symbol.

The situation is somewhat different for the unary case, with alphabet $\Sigma = \{a\}$. In the
case of an NFA with all final states, the maximum blow-up in going from an NFA to a DFA
is $n \to n + 1$ states. To see this, note that if a unary $n$-state NFA with all final states has
a directed cycle, then it accepts $a^*$, which can be done with a 1-state DFA. Otherwise there
exists a $k \le n$ such that $a^k$ is the shortest string not accepted. This can be accepted with
a $(k + 1)$-state DFA (by adding the missing dead state). In the case $k = n$, this results in an
$n \to n + 1$ blowup. The same results occur for NFAs with all states initial and one final, or
with all states both initial and final.

Now we turn to the case of larger alphabets. 

Theorem 11. For $n = 1$ and every $n \ge 3$ there exists an NFA $M$ over a binary alphabet
with $n$ states, all of which are final, such that the minimal DFA accepting $L(M)$ has $2^n$
states. No such binary NFA exists for $n = 2$, although over a ternary alphabet one exists.
Proof. For $n = 1$ we take the automaton with a single state which is both initial and final,
with a self-loop on only one of the two letters.

For $n = 2$ we can enumerate all possible binary NFAs with all states final and check that
none of them have a minimal DFA with 4 states.

It is easy to verify that the ternary NFA in Figure 4 has deterministic state complexity 4.

Figure 4: A two-state NFA with both states final where the minimal equivalent DFA has 4
states.

Now assume $n \ge 3$. We define an NFA $M = (Q, \Sigma, \delta, 0, F)$ (Figure 5), where $Q =
\{0, \ldots, n-1\}$, $\Sigma = \{0, 1\}$, $F = Q$, and for any $i$, $0 \le i \le n-1$,

$$\delta(i, a) = \begin{cases} \{i+1\}, & \text{if } a = 0 \text{ and } 0 \le i \le n-3; \\ \{n-1\}, & \text{if } a = 0 \text{ and } i = n-1; \\ \{0, i+1\}, & \text{if } a = 1 \text{ and } 0 \le i \le n-2. \end{cases}$$

Figure 5: The NFA M of Theorem 11.

Let $M' = (2^Q, \Sigma, \delta', \{0\}, F')$ be the DFA obtained by applying the subset construction
to $M$. To show that $M'$ is minimal we will show (a) that all states of $M'$ are reachable,
and (b) that the states of $M'$ are pairwise inequivalent with respect to the Myhill-Nerode
equivalence relation.

To prove part (a) let $S \subseteq Q$ be a state of $M'$, where $S = \{s_1, s_2, \ldots, s_k\}$ for some $k$ and
$s_1 < s_2 < \cdots < s_k$. There are two cases to consider.

Case 1: $n - 1 \notin S$. Then

$$\delta'(\{0\},\, 0^{s_k - s_{k-1} - 1} 1\, 0^{s_{k-1} - s_{k-2} - 1} 1 \cdots 0^{s_2 - s_1 - 1} 1\, 0^{s_1}) = S.$$

To see this, let $w_k = \epsilon$ and for $1 \le i \le k - 1$, let

$$w_i = 0^{s_k - s_{k-1} - 1} 1\, 0^{s_{k-1} - s_{k-2} - 1} 1 \cdots 0^{s_{i+1} - s_i - 1} 1.$$

For $1 \le i \le k$, let $S_i = \delta'(\{0\}, w_i)$. Then $S_{i-1} = (S_i + s_i - s_{i-1}) \cup \{0\}$. We see that
$S_i = \{s_i < s_{i+1} < \cdots < s_k\} - s_i$. Here, for $m \in Q$, the notation $S + m$ refers to the set
$\{x + m : x \in S\}$. Thus

$$\delta'(\{0\}, w_1 0^{s_1}) = \delta'(S_1, 0^{s_1}) = S_1 + s_1 = S,$$

as required.

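Case 2: $n - 1 \in S$. By the argument of Case 1, $S \setminus \{n-1\}$ is reachable. But then

$$\delta'(S \setminus \{n-1\},\, 0^{n - 2 - s_{k-1}} 1\, 0^{s_{k-1} - s_{k-2} - 1} 1 \cdots 0^{s_2 - s_1 - 1} 1\, 0^{s_1}) = S.$$

To see this, for $1 \le i \le k - 1$, let

$$w_i = 0^{n - 2 - s_{k-1}} 1\, 0^{s_{k-1} - s_{k-2} - 1} 1 \cdots 0^{s_{i+1} - s_i - 1} 1.$$

For $1 \le i \le k - 1$, let $S_i = \delta'(S \setminus \{n-1\}, w_i)$. Then

$$S_{i-1} = (((S_i \setminus \{n-1\}) + s_i - s_{i-1}) \bmod (n-1)) \cup \{n-1\}.$$

We see that

$$S_i = (((S \setminus \{n-1\}) + n - 1 - s_i) \bmod (n-1)) \cup \{n-1\} = (((S \setminus \{n-1\}) - s_i) \bmod (n-1)) \cup \{n-1\}.$$

Thus

$$\delta'(S \setminus \{n-1\}, w_1 0^{s_1}) = \delta'(S_1, 0^{s_1}) = (((S \setminus \{n-1\}) - s_1 + s_1) \bmod (n-1)) \cup \{n-1\} = (S \setminus \{n-1\}) \cup \{n-1\} = S,$$

as required.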

To prove part (b) let $S$ and $T$ be distinct states of $M'$. We have 2 cases.

Case 1: $n - 1$ is in exactly one of $S$ or $T$. Without loss of generality, suppose $n - 1 \notin S$
and $n - 1 \in T$. Then $\delta'(S, 0^{n-1}) = \emptyset$ and $\delta'(T, 0^{n-1}) = \{n-1\}$, so $S$ and $T$ are inequivalent.

Case 2: either $n - 1$ is in both of $S$ and $T$ or $n - 1$ is in neither. Without loss of generality,
suppose there exists $i \notin S$, $i \in T$. Then $\delta'(S, 0^{n-2-i} 1) = S'$ and $\delta'(T, 0^{n-2-i} 1) = T'$, where
$n - 1 \notin S'$ and $n - 1 \in T'$. We now apply the argument of Case 1. □
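For small $n$ the blow-up can also be checked mechanically. The sketch below builds the transition table of the NFA of Theorem 11, applies the subset construction to the reachable subsets only, and counts Myhill-Nerode classes by Moore-style partition refinement. The function names and the dictionary representation are assumptions for illustration, and pairs $(i, a)$ not listed in the definition are assumed to have no transition.

```python
def theorem11_nfa(n):
    """Transitions of the n-state NFA of Theorem 11 as a dict
    (state, symbol) -> set of states; missing keys mean no transition."""
    delta = {}
    for i in range(n):
        if i <= n - 3:
            delta[i, "0"] = {i + 1}
        if i == n - 1:
            delta[i, "0"] = {n - 1}
        if i <= n - 2:
            delta[i, "1"] = {0, i + 1}
    return delta


def minimal_dfa_size(delta, alphabet, initial, final):
    """Number of states of the minimal complete DFA equivalent to the NFA."""
    start = frozenset(initial)
    seen, todo, step = {start}, [start], {}
    while todo:                      # subset construction on reachable subsets
        S = todo.pop()
        for a in alphabet:
            T = frozenset(q for p in S for q in delta.get((p, a), ()))
            step[S, a] = T
            if T not in seen:
                seen.add(T)
                todo.append(T)
    block = {S: bool(S & final) for S in seen}
    while True:                      # refine until the partition is stable
        ids = {}
        new_block = {}
        for S in seen:
            signature = (block[S],) + tuple(block[step[S, a]] for a in alphabet)
            new_block[S] = ids.setdefault(signature, len(ids))
        if len(ids) == len(set(block.values())):
            return len(ids)
        block = new_block
```

A call such as `minimal_dfa_size(theorem11_nfa(4), "01", {0}, set(range(4)))` computes the deterministic state complexity for $n = 4$; the search is exponential in $n$, so this is only a sanity check for small cases.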

We now turn to the case where all states are both initial and final. 
Theorem 12. For every $n \ge 1$ there exists an NFA $M$ over a binary alphabet with $n$ states,
each of which is both initial and final, such that the minimal DFA accepting $L(M)$ has $2^n$
states.

Proof. For $n = 1$ we take the automaton with a single state which is both initial and final,
with a self-loop on only one of the two letters.

Now assume $n \ge 2$. We define an NFA $M = (Q, \Sigma, \delta, Q, F)$ (Figure 6), where $Q =
\{0, \ldots, n-1\}$, $\Sigma = \{0, 1\}$, $F = Q$, and for any $i$, $0 \le i \le n-1$,

$$\delta(i, a) = \begin{cases} \{i-1\}, & \text{if } a = 0 \text{ and } 1 \le i \le n-1; \\ \{n-1\}, & \text{if } a = 0 \text{ and } i = 0; \\ \{i+1\}, & \text{if } a = 1 \text{ and } 0 \le i \le n-2. \end{cases}$$








Figure 6: The NFA M of Theorem 12. 

Let $M' = (2^Q, \Sigma, \delta', Q, F')$ be the DFA obtained by applying the subset construction
to $M$. To show that $M'$ is minimal we will show (a) that all states of $M'$ are reachable,
and (b) that the states of $M'$ are pairwise inequivalent with respect to the Myhill-Nerode
equivalence relation.

To prove part (a) let $S \subseteq Q$ be a state of $M'$, where $S = Q \setminus \{s_1, s_2, \ldots, s_k\}$ for some $k$
and $s_1 < s_2 < \cdots < s_k$. Then

$$\delta'(Q,\, 0^{s_1 + 1} 1\, 0^{s_2 - s_1 + 1} 1 \cdots 0^{s_k - s_{k-1} + 1} 1\, 0^{n - s_k}) = S.$$

To see this, for $1 \le i \le k$, let

$$w_i = 0^{s_1 + 1} 1\, 0^{s_2 - s_1 + 1} 1 \cdots 0^{s_i - s_{i-1} + 1} 1,$$

and let $S_i = \delta'(Q, w_i)$. Then one easily verifies that

$$S_i = ((Q \setminus \{s_1, s_2, \ldots, s_i\}) - s_i) \bmod n,$$

so $\delta'(Q, w_k 0^{n - s_k}) = \delta'(S_k, 0^{n - s_k}) = (S - s_k - n + s_k) \bmod n = S$.

To prove part (b) let $S$ and $T$ be distinct states of $M'$. Without loss of generality, suppose
there exists $i \notin S$, $i \in T$. Then $\delta'(S, 0^i 1^{n-1}) = \emptyset$ and $\delta'(T, 0^i 1^{n-1}) = \{n-1\}$, so $S$ and $T$ are
inequivalent. □
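Both parts of this argument are explicit enough to be replayed by a short program. The sketch below is an illustration under stated assumptions: it works directly with the subset transition function of the NFA of Theorem 12, and the function names (`delta12`, `run`, `reaching_word`, `separating_word`) are hypothetical.

```python
def delta12(n, subset, symbol):
    """Subset transition function: 0 rotates every state down by one
    (state 0 goes to n - 1); 1 shifts states up by one and drops n - 1."""
    if symbol == "0":
        return frozenset((i - 1) % n for i in subset)
    return frozenset(i + 1 for i in subset if i <= n - 2)


def run(n, subset, word):
    """Extend delta12 to words."""
    for symbol in word:
        subset = delta12(n, subset, symbol)
    return frozenset(subset)


def reaching_word(n, removed):
    """Word from part (a) taking Q to Q minus removed, where removed is the
    increasing list s_1 < ... < s_k; the empty list gives the empty word."""
    if not removed:
        return ""
    word, previous = "", None
    for s in removed:
        gap = s + 1 if previous is None else s - previous + 1
        word += "0" * gap + "1"
        previous = s
    return word + "0" * (n - removed[-1])


def separating_word(n, i):
    """Word from part (b): it empties any subset that lacks state i and
    sends any subset that contains i to {n - 1}."""
    return "0" * i + "1" * (n - 1)
```

Checking `run(n, Q, reaching_word(n, removed))` against `Q - set(removed)` for every choice of `removed` confirms reachability of all $2^n$ subsets for a given small $n$.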
Finally, we consider the case where all states are initial, and only one state is final. An
example of maximal blowup from $n$ states to $2^n$ deterministic states was first given by Gill
and Kou [7], but their construction was not over a fixed alphabet. Later, Veloso and Gill
[16] gave an example over a binary alphabet. Here we give another example. The following
NFA, which is a trivial variation on that in Figure 6, demonstrates the maximum blow-up
from $n$ states to $2^n$ deterministic states for all $n \ge 1$. We omit the proof, which is a trivial
variation of the proof of Theorem 12.








Figure 7: The NFA demonstrating maximum blow-up for all states initial, one state final. 



8 State complexity of pref(L), suff(L), fact(L) 

In this section we consider the state complexity of the operations pref(L), suff(L), and
fact(L).

If the state complexity of L is $n$, the state complexity of pref(L) is also at most $n$, as can
be seen from the standard construction for pref(L) where we change every state from which
a final state can be reached to final.

The state complexity of suff(L) is more interesting. 

Theorem 13. Let $M$ be a DFA with $n$ states. Then suff(L(M)) can be accepted by a DFA
with at most $2^n - 1$ states, and this bound is tight.

Proof. Let $M = (Q, \Sigma, \delta, 0, F)$, where $Q = \{0, \ldots, n-1\}$. Then suff(L(M)) is accepted by
the generalized NFA $N = (Q, \Sigma, \delta, P, F)$, where $P \subseteq Q$ is the set of states reachable from
the start state. But it is clear that the empty set is not reachable from any nonempty set of
states of $N$, so the minimal equivalent DFA has at most $2^n - 1$ states.

To show the bound is tight, consider the DFA $M = (Q, \Sigma, \delta, 0, F)$ on states $Q =
\{0, 1, \ldots, n-1\}$ (Figure 8) defined by

$\delta(q, 0) = q$, for $0 \le q < n - 1$;
$\delta(n-1, 0) = 0$;
$\delta(q, 1) = (q + 1) \bmod n$, for $0 \le q < n$;

and with $F = \{0\}$.

Now consider the generalized NFA $N = (Q, \Sigma, \delta, Q, F)$. By the argument above, $N$
accepts suff(L(M)). Let $M' = (2^Q \setminus \{\emptyset\}, \Sigma, \delta', Q, F')$ be the DFA obtained by applying the
subset construction to $N$ and removing the empty set. To show that $M'$ is minimal we
will show (a) that all states of $M'$ are reachable, and (b) that the states of $M'$ are pairwise
inequivalent with respect to the Myhill-Nerode equivalence relation.

Figure 8: The DFA M of Theorem 13.

To prove part (a) let $S \subseteq Q$ be a state of $M'$, where $S = Q \setminus \{s_1, s_2, \ldots, s_k\}$ for some $k$
and $s_1 < s_2 < \cdots < s_k$. Let $T \subseteq Q$, $T \ne \emptyset$. If both $t$ and $t + 1$ are in $T$, $t < n - 1$, then one
easily verifies that

$$\delta'(T, 1^{n-1-t}\, 0\, 1^{t+1}) = T \setminus \{t\}.$$

For $0 \le i < n - 1$, define $w_i = 1^{n-1-i}\, 0\, 1^{i+1}$. We have two cases.

Case 1: $s_k \ne n - 1$. We see that

$$\delta'(Q, w_{s_1} w_{s_2} \cdots w_{s_k}) = Q \setminus \{s_1, s_2, \ldots, s_k\} = S,$$

as required.

Case 2: — n — 1. Let T — Q \ {si, S2, ■ ■ ■ , Sfe-i}, where Si < S2 < • • • < Sfe-i- By the 
argument of Case 1 

5'{Q,Ws^Ws^ ■ ■•Ws^_^) = T. 

Since 7^ 0, there exists a smallest teT,t^n-l. If i = 0, then 5'(T, 0) ^T\{n-l} = -S. 
Otherwise, 

5'(T, 0(l"-iO)*-il*-i) = T \ {n - 1} U {t - 1} = T'. 

But now 

5'{T\wt.,)=T\{n-l} = S, 

as required. 

To prove part (b) let $S$ and $T$ be distinct states of $M'$. Without loss of generality, suppose 
there exists $i \notin S$, $i \in T$. The set of final states $F'$ consists of all subsets of $Q$ containing 0. 
But $0 \notin \delta'(S, 1^{n-i})$ and $0 \in \delta'(T, 1^{n-i})$, so $S$ and $T$ are inequivalent. □ 
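
For small $n$ the tightness claim can be checked mechanically. The following Python sketch is our own illustration, not part of the proof: it builds the witness DFA of Theorem 13 as defined above, runs the subset construction from the initial set $Q$, and counts Myhill-Nerode classes by Moore-style partition refinement. All function names are hypothetical.

```python
def suffix_witness(n):
    """Transition table of the DFA M of Theorem 13 on states 0..n-1."""
    delta = {}
    for q in range(n):
        delta[(q, "0")] = q if q < n - 1 else 0   # 0 fixes q, except n-1 -> 0
        delta[(q, "1")] = (q + 1) % n             # 1 is a cyclic shift
    return delta

def step(delta, subset, a):
    """One step of the subset construction."""
    return frozenset(delta[(q, a)] for q in subset)

def reachable_subsets(delta, start):
    """All subsets reachable from `start` under the letters 0 and 1."""
    seen = {start}
    stack = [start]
    while stack:
        s = stack.pop()
        for a in "01":
            t = step(delta, s, a)
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return seen

def count_classes(delta, subsets, is_final):
    """Number of pairwise inequivalent states, by partition refinement."""
    block = {s: int(is_final(s)) for s in subsets}
    while True:
        ids = {}
        refined = {}
        for s in subsets:
            sig = (block[s],) + tuple(block[step(delta, s, a)] for a in "01")
            refined[s] = ids.setdefault(sig, len(ids))
        if len(ids) == len(set(block.values())):
            return len(ids)       # no block was split: the partition is stable
        block = refined

def suffix_dfa_size(n):
    """(reachable subsets, inequivalent subsets) for the generalized NFA N."""
    delta = suffix_witness(n)
    subsets = reachable_subsets(delta, frozenset(range(n)))  # every state is initial
    return len(subsets), count_classes(delta, subsets, lambda s: 0 in s)
```

Both components of `suffix_dfa_size(n)` should equal $2^n - 1$; the empty set never appears because the transition function of $M$ is total.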

We now turn to the state complexity of fact(L): 

Theorem 14. Let $M$ be a DFA with $n$ states. Then fact$(L(M))$ can be accepted by a DFA 
with at most $2^{n-1}$ states, and this bound is tight. 

Proof. Let $M = (Q, \Sigma, \delta, 0, F)$, where $Q = \{0, \ldots, n-1\}$. Let us assume that $M$ contains 
no unreachable states. Suppose that every state of $M$ can reach a final state. Then 
fact$(L(M)) = \Sigma^*$ and is accepted by a one-state DFA. Let us suppose then that there exists 
$q \in Q$ such that $q$ cannot reach a final state. Then we may remove the state $q$ and any 
associated transitions to obtain an equivalent NFA with $n-1$ states. Then fact$(L(M))$ is 
accepted by the generalized NFA $N = (Q \setminus \{q\}, \Sigma, \delta, Q \setminus \{q\}, P)$, where $P \subseteq Q$ is the set of 
states that can reach a final state. The minimal DFA equivalent to $N$ thus has at most $2^{n-1}$ 
states. 

To show the bound is tight, consider the DFA $M$ on states $Q = \{0, 1, \ldots, n-1\}$ (Figure 9) 
defined by 

$$\delta(q, 0) = q, \quad \text{for } 0 \le q < n-2;$$
$$\delta(q, 0) = n-1, \quad \text{for } q = n-2, n-1;$$
$$\delta(q, 1) = (q + 1) \bmod n, \quad \text{for } 0 \le q < n-2;$$
$$\delta(n-2, 1) = 0;$$
$$\delta(n-1, 1) = n-1;$$

and with $F = \{0\}$. 





Figure 9: The DFA M of Theorem 14. 

Note that state $n-1$ cannot reach a final state. Let $\overline{M}$ be the NFA obtained by removing 
state $n-1$ from $M$, along with all associated transitions. Let $N$ be the generalized NFA 
obtained from $\overline{M}$ by making all states both initial and final. Then $N$ accepts fact$(L(M))$. 
Let $Q' = \{0, \ldots, n-2\}$. Let $M' = (2^{Q'}, \Sigma, \delta', Q', F')$ be the DFA obtained by applying the 
subset construction to $N$. To show that $M'$ is minimal we will show (a) that all states of 
$M'$ are reachable, and (b) that the states of $M'$ are pairwise inequivalent with respect to the 
Myhill-Nerode equivalence relation. 

To prove part (a) let $S \subseteq Q'$ be a state of $M'$, where $S = Q' \setminus \{s_1, s_2, \ldots, s_k\}$ for some $k$, 
and $s_1 < s_2 < \cdots < s_k$. One easily verifies that for any $T \subseteq Q'$ and $t \in Q'$, 

$$\delta'(T, 1^{n-2-t} 0 1^{t+1}) = T \setminus \{t\},$$

from which it is clear that $S$ is reachable. 

To prove part (b) let $S$ and $T$ be distinct states of $M'$. Without loss of generality, suppose 
there exists $i \notin S$, $i \in T$. Then by the argument of part (a), there exists a string $w$ such 
that $\delta'(S, w) = \emptyset$ and $\delta'(T, w) = \{i\}$, so $S$ and $T$ are inequivalent. □ 
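
As with Theorem 13, the bound can be confirmed by brute force for small $n$. The sketch below is an illustration under our own assumptions (hypothetical function names, $n \ge 2$ so that state $n-2$ exists): it encodes the witness with the dead state $n-1$ already removed, makes every remaining state initial and final, and counts reachable and inequivalent subsets.

```python
def factor_witness(n):
    """Partial transitions of the Theorem 14 witness with dead state n-1 removed."""
    delta = {}
    for q in range(n - 1):
        if q < n - 2:
            delta[(q, "0")] = q
            delta[(q, "1")] = q + 1
        else:
            # q == n-2: the 0-transition led to the removed state, so it is dropped.
            delta[(q, "1")] = 0
    return delta

def step(delta, subset, a):
    """One step of the subset construction for a partial transition table."""
    return frozenset(delta[(q, a)] for q in subset if (q, a) in delta)

def reachable_subsets(delta, start):
    """All subsets reachable from `start` under the letters 0 and 1."""
    seen = {start}
    stack = [start]
    while stack:
        s = stack.pop()
        for a in "01":
            t = step(delta, s, a)
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return seen

def count_classes(delta, subsets, is_final):
    """Number of pairwise inequivalent states, by partition refinement."""
    block = {s: int(is_final(s)) for s in subsets}
    while True:
        ids = {}
        refined = {}
        for s in subsets:
            sig = (block[s],) + tuple(block[step(delta, s, a)] for a in "01")
            refined[s] = ids.setdefault(sig, len(ids))
        if len(ids) == len(set(block.values())):
            return len(ids)       # no block was split: the partition is stable
        block = refined

def factor_dfa_size(n):
    """(reachable subsets, inequivalent subsets); a subset is final iff nonempty."""
    delta = factor_witness(n)
    subsets = reachable_subsets(delta, frozenset(range(n - 1)))
    return len(subsets), count_classes(delta, subsets, lambda s: len(s) > 0)
```

Here the empty set is reachable, which is why the count is $2^{n-1}$ rather than $2^{n-1} - 1$.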

9 Nondeterministic state complexity of complement 



We now consider the following question. Let $M$ be an NFA with all states final, accepting a 
language $L$. What is the maximum size of a minimal NFA accepting $\overline{L}$, the complement of $L$? 

The case where we remove the restriction that all states be final was previously studied 
by Sakoda and Sipser [14], Birget [2], Ellul et al. [4], and Jirásková [9]. Jirásková constructed 
an $n$-state NFA $N$ over the alphabet $\{0, 1\}$ such that any NFA accepting $\overline{L(N)}$ requires at 
least $2^n$ states. 

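Jirásková's NFA is defined as follows: let $N = (Q, \Sigma, \delta, 0, F)$, where $Q = \{0, \ldots, n-1\}$, 
$\Sigma = \{0, 1\}$, $F = \{n-1\}$, and for any $i$, $0 \le i \le n-1$, 

$$\delta(i, a) = \begin{cases} \{0, i+1\}, & \text{if } a = 0 \text{ and } i < n-1; \\ \{1, 2, \ldots, n-1\}, & \text{if } a = 0 \text{ and } i = n-1; \\ \{i+1\}, & \text{if } a = 1 \text{ and } i < n-1. \end{cases}$$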

By modifying this construction we prove 

Theorem 15. For n > 1, there exists an NFA $M$ of $n+1$ states over a three-letter alphabet 
with all states final such that any NFA accepting $\overline{L(M)}$ requires at least $2^n$ states. 

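Proof. Let $N$ be the NFA described above. Let $Q' = \{0, \ldots, n\}$ and let $M = (Q', \{0, 1, 2\}, \delta', 0, Q')$, 
where for any $i$, $0 \le i \le n$, 

$$\delta'(i, a) = \begin{cases} \delta(i, a), & \text{if } a \neq 2 \text{ and } i < n; \\ \{n\}, & \text{if } a = 2 \text{ and } i = n-1. \end{cases}$$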



Then by modifying the fooling set argument of Jirásková [9, Theorem 5] one obtains a fooling 
set of size $2^n$ for $\overline{L(M)}$, giving the desired result. (One obtains the fooling set for $\overline{L(M)}$ by 
appending a 2 to the second word in each pair of the fooling set for $\overline{L(N)}$.) □ 
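
One way to see why appending a 2 is the natural modification: for a binary word $w$, the word $w2$ is accepted by $M$ exactly when $w$ is accepted by $N$, because the only transition on 2 leaves state $n-1$, the unique final state of $N$. This observation is our reading of the construction rather than a statement taken from [9]. The Python sketch below checks it exhaustively on short words; the function names are hypothetical, and the fooling set itself is not reproduced.

```python
from itertools import product

def jiraskova_nfa(n):
    """Transitions of the NFA N as defined in the text (missing entries are empty)."""
    delta = {}
    for i in range(n - 1):
        delta[(i, "0")] = {0, i + 1}
        delta[(i, "1")] = {i + 1}
    delta[(n - 1, "0")] = set(range(1, n))
    return delta

def all_final_variant(n):
    """Transitions of M: those of N, plus a 2-transition from n-1 to the new state n."""
    delta = dict(jiraskova_nfa(n))
    delta[(n - 1, "2")] = {n}
    return delta

def run(delta, start, word):
    """Set of states reachable from `start` by reading `word`."""
    current = {start}
    for a in word:
        current = {r for q in current for r in delta.get((q, a), ())}
    return current

def relation_holds(n, max_len):
    """Check: w in L(N)  iff  w2 in L(M), for all binary w with |w| <= max_len."""
    d_n, d_m = jiraskova_nfa(n), all_final_variant(n)
    for length in range(max_len + 1):
        for letters in product("01", repeat=length):
            w = "".join(letters)
            in_n = (n - 1) in run(d_n, 0, w)      # N accepts in state n-1 only
            in_m = bool(run(d_m, 0, w + "2"))     # every state of M is final
            if in_n != in_m:
                return False
    return True
```

Taking complements on both sides gives the corresponding relation between $\overline{L(N)}$ and $\overline{L(M)}$ on words of the form $w2$.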



10 Shortest word not accepted 

Finally, we consider one more problem. Given an $n$-state NFA $M$ with all states final, such 
that $L(M) \neq \Sigma^*$, how long can the shortest unaccepted string be? At first glance it might 
appear that such a string has to be of length < n, but this is not the case. 

Theorem 16. There exists an $n$-state NFA $M$ with all states final, such that the smallest 
string not accepted by $M$ has length $2^{cn}$ for some constant $0 < c < 1$. 

Proof. In [4] the authors show that there exist $n$-state NFAs $M$ over a 2-letter alphabet $\Sigma$ 
such that the shortest string not accepted is of length $2^{cn}$ for some constant $0 < c < 1$. We 
take such an NFA $M$, and add a new symbol, say #, with transitions on # from every final 
state of $M$ back to $M$'s initial state. Now make all states final. Call the resulting NFA $M'$. 
Since $M$ accepts $\epsilon$, its initial state is also final, and hence $M'$ has a transition from its initial 
state to itself on #. 

We claim that $L(M') \neq (\Sigma \cup \{\#\})^*$, but the shortest string not accepted by $M'$ is at 
least as long as that for $M$. Let $w$ be the shortest string not accepted by $M$, of length $N$. 
Then either there is no path in $M$ labeled $w$, or every path labeled $w$ in $M$ arrives at a non-accepting 
state in $M$. In either case $w\#$ fails to be accepted by $M'$. On the other hand, $M'$ 
accepts all strings shorter than $w$, since any shorter string $w'$ is of the form $w_1 \# w_2 \# \cdots \# w_r$ 
for some strings $w_1, w_2, \ldots, w_r \in \Sigma^*$, where each $w_i$ has length $< N$. Starting in the initial 
state of $M'$, we read $w_1$, which is accepted by $M$ since it is of length $< |w|$. If $w' = w_1$, then 
$w_1$ is accepted by $M'$. Otherwise we follow the transition on # back to the initial state of 
$M'$ and continue with $w_2$, etc. □ 
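
The construction in this proof is easy to experiment with. The Python sketch below is a minimal illustration under stated assumptions: the three-state automaton is a toy example of our own (it is not the exponential witness from [4]), and the helper names are hypothetical. It adds the # transitions, makes all states final, and finds the length of a shortest rejected string by breadth-first search over the subset construction.

```python
from collections import deque

def shortest_rejected_length(alphabet, delta, initial, finals):
    """Length of a shortest string not accepted by the NFA, or None if none exists."""
    start = frozenset(initial)
    dist = {start: 0}
    queue = deque([start])
    while queue:
        s = queue.popleft()
        if not (s & finals):          # no final state in the subset: string rejected
            return dist[s]
        for a in alphabet:
            t = frozenset(r for q in s for r in delta.get((q, a), ()))
            if t not in dist:
                dist[t] = dist[s] + 1
                queue.append(t)
    return None

def add_reset_symbol(states, delta, initial_state, finals, symbol="#"):
    """Add `symbol` transitions from each final state to the initial state,
    then make every state final (the construction of M' in the proof)."""
    new_delta = {key: set(targets) for key, targets in delta.items()}
    for q in finals:
        new_delta[(q, symbol)] = {initial_state}
    return new_delta, frozenset(states)

# Toy automaton M (hypothetical): counts 1s modulo 3 and rejects when the count is 2.
STATES = {0, 1, 2}
DELTA = {}
for q in STATES:
    DELTA[(q, "0")] = {q}
    DELTA[(q, "1")] = {(q + 1) % 3}
FINALS = frozenset({0, 1})            # the initial state 0 is final, so M accepts the empty string

DELTA2, FINALS2 = add_reset_symbol(STATES, DELTA, 0, FINALS)
LEN_M = shortest_rejected_length("01", DELTA, {0}, FINALS)
LEN_M2 = shortest_rejected_length("01#", DELTA2, {0}, FINALS2)
```

For this toy automaton a shortest rejected string of $M$ is 11, and a shortest rejected string of $M'$ is 11#, matching the argument that $w\#$ is rejected while all strings shorter than $w$ are accepted.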

We can obtain a similar result for NFAs where all states are both initial and final. In 
this case, we again add a new symbol # with transitions on # from every final state of $M$ 
back to $M$'s initial state, and then make all states both initial and final. Now we argue as 
above, except we consider the string $\#w\#$ instead. 